Mining: Basic Concepts

Author

  • Douglas W. Oard
Abstract

This survey reviews a broad array of techniques that are becoming available for mining textual data. It first presents a three-function text mining architecture (data collection, data warehousing, data exploitation) comprising a six-step text mining process (source selection, text retrieval, information extraction, data storage, data mining, presentation). It then presents some of the most widely used data and text mining techniques, including clustering and classification methods (nearest neighbor, relational learning models, genetic algorithms) and dependency models (graph-theoretic link analysis, linear regression and decision trees, nonlinear regression and neural networks). Finally, the survey illustrates some of the potential of these techniques by describing the Office of Naval Research (ONR) text mining pilot program. In the first year of that program, existing metadata from commercial bibliographic databases was used. There is presently an unacceptably long delay between the development of key component technologies for textual data mining and the deployment of the integrated tools that science and technology (S&T) sponsors need. The first year of the ONR text mining pilot program represents an initial attempt to bridge that gap. Important lessons have been learned about the use of text mining for the management of science and technology research, but much remains to be done.
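As a rough illustration of one technique the abstract names, nearest-neighbor classification, the sketch below assigns a topic label to a short text by finding the most similar labeled text. It is a minimal example under stated assumptions, not the survey's or the ONR program's method: the bag-of-words representation, the cosine similarity measure, the function names, and the three sample documents are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count term frequencies."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbor_label(query, labeled_docs):
    """Return the label of the labeled document most similar to the query (1-nearest neighbor)."""
    query_vec = bag_of_words(query)
    best_label, best_score = None, -1.0
    for text, label in labeled_docs:
        score = cosine_similarity(query_vec, bag_of_words(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled records, invented for this example only.
TRAINING = [
    ("sonar signal processing for underwater acoustics", "acoustics"),
    ("neural network training with gradient descent", "machine learning"),
    ("corrosion resistant coatings for ship hulls", "materials"),
]

if __name__ == "__main__":
    print(nearest_neighbor_label("acoustics of underwater sonar arrays", TRAINING))
```

A practical system would normally add tokenization, stop-word removal, and term weighting before comparing documents; those steps are omitted here to keep the sketch short.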

Similar resources

Predictive Data Mining: A Generalized Approach

In this paper, we take on the ambitious task of formulating a general framework for data mining. We explain the requirements that the framework should fulfil: it should elegantly handle different types of data, different data mining tasks, and different types of patterns/models. We also discuss data mining languages and what they should support: this includes the design and implementation of data mi...

Application Of Data Mining In Bioinformatics

This article highlights some of the basic concepts of bioinformatics and data mining. The major research areas of bioinformatics are highlighted. The application of data mining in the domain of bioinformatics is explained. It also highlights some of the current challenges and opportunities of data mining in bioinformatics.

Algorithms for Mining Association Rules: An Overview

In this paper, we present the basic concepts of association rule mining and compare existing algorithms for it. Of course, a single article cannot describe all the algorithms in detail, yet we have tried to cover the major theoretical issues, which can help researchers in their work. Keywords: association rules, algorithm, itemsets, database.

A scalable mining of frequent quadratic concepts in d-folksonomies

Folksonomy mining is attracting the interest of the Web 2.0 community, since folksonomies represent the core data of social resource sharing systems. However, a close look at related work on mining folksonomies reveals that the time stamp dimension has not been considered. For example, the large number of works dedicated to mining tri-concepts from folksonomies did not take into account time di...

Increasing Performance of Rule Mining in the Medical Domain Using Natural Intelligence Concepts

This paper discusses how concepts derived from nature can be applied successfully to improve the performance of the rule mining process. These concepts are derived from swarm intelligence and the behavior of frogs. Swarm Intelligence (SI) is the property of a system whereby the collective behavior of agents interacting locally with their environment causes coherent functional global patterns to eme...

Chapter 8 INTRODUCTION TO SUPERVISED METHODS

This chapter summarizes the fundamental aspects of supervised methods. The chapter provides an overview of concepts from various interrelated fields used in subsequent chapters. It presents basic definitions and arguments from the supervised machine learning literature and considers various issues, such as performance evaluation techniques and challenges for data mining tasks.

Publication date: 2003